A New Supervised Learning Algorithm for Word Sense Disambiguation
Authors
Abstract
The Naive Mix is a new supervised learning algorithm that is based on a sequential method for selecting probabilistic models. The usual objective of model selection is to find a single model that adequately characterizes the data in a training sample. However, during model selection a sequence of models is generated that consists of the best-fitting model at each level of model complexity. The Naive Mix utilizes this sequence of models to define a probabilistic model, which is then used as a probabilistic classifier to perform word-sense disambiguation. The models in this sequence are restricted to the class of decomposable log-linear models. This class of models offers a number of computational advantages. Experiments disambiguating twelve different words show that a Naive Mix formulated with a forward sequential search and Akaike's Information Criterion rivals established supervised learning algorithms such as decision trees (C4.5), rule induction (CN2) and nearest-neighbor classification (PEBLS).
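The idea of classifying with a whole sequence of models, rather than with the single selected model, can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the models are restricted to Naive-Bayes-style structures (each contextual feature either depends on the sense or is independent of it), the sequence is combined by averaging the sense posteriors of its models, probabilities are add-one smoothed, and the names `forward_search`, `posterior`, `naive_mix_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(column, labels):
    """Empirical mutual information (nats) between one feature and the sense."""
    n = len(labels)
    joint = Counter(zip(column, labels))
    fx, fy = Counter(column), Counter(labels)
    return sum((c / n) * math.log(c * n / (fx[x] * fy[y]))
               for (x, y), c in joint.items())


def forward_search(rows, labels):
    """Greedy forward search scored by AIC.

    Returns the feature set of the best model found at each complexity
    level, starting from the model in which no feature depends on the sense.
    """
    n, n_classes = len(labels), len(set(labels))

    def aic_drop(f):
        col = [r[f] for r in rows]
        extra_params = (n_classes - 1) * (len(set(col)) - 1)
        # Linking feature f to the sense raises the maximized
        # log-likelihood by n * MI(f, sense) and adds extra_params parameters.
        return 2 * n * mutual_information(col, labels) - 2 * extra_params

    remaining = set(range(len(rows[0])))
    selected, sequence = [], [[]]
    while remaining:
        best = max(sorted(remaining), key=aic_drop)
        if aic_drop(best) <= 0:  # no remaining feature lowers the AIC
            break
        selected.append(best)
        remaining.remove(best)
        sequence.append(list(selected))
    return sequence


def posterior(features, rows, labels, x, alpha=1.0):
    """Sense posterior for instance x under one model (add-alpha smoothing)."""
    scores = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        p = len(members) / len(labels)
        for f in features:
            n_values = len({r[f] for r in rows})
            hits = sum(1 for r in members if r[f] == x[f])
            p *= (hits + alpha) / (len(members) + alpha * n_values)
        scores[c] = p
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def naive_mix_predict(rows, labels, x):
    """Average the posteriors of every model in the sequence (an assumption)."""
    sequence = forward_search(rows, labels)
    mixed = Counter()
    for features in sequence:
        for c, p in posterior(features, rows, labels, x).items():
            mixed[c] += p / len(sequence)
    return max(mixed, key=mixed.get), dict(mixed)


# Toy data for the ambiguous word "bank": feature 0 is a context cue,
# feature 1 is noise that carries no information about the sense.
rows = [("water", "a"), ("water", "b"), ("water", "a"), ("water", "b"),
        ("loan", "a"), ("loan", "b"), ("loan", "a"), ("loan", "b")]
labels = ["shore"] * 4 + ["money"] * 4
sense, probs = naive_mix_predict(rows, labels, ("water", "a"))
```

In this toy run the search links only the informative feature to the sense, so the sequence holds two models, and the mixed posterior lies between the prior and the single-feature model. A faithful implementation would search the full class of decomposable log-linear models rather than this restricted family.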
Related resources
Learning Probabilistic Models of Word Sense Disambiguation
This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised case, the Naive Bayesi...
Word Sense Disambiguation by Semi-supervised Learning
In this paper we propose using a semi-supervised learning algorithm to address the word sense disambiguation problem. We evaluated a semi-supervised learning algorithm, the local and global consistency algorithm, on a widely used benchmark corpus for word sense disambiguation. This algorithm yields encouraging experimental results. It achieves better performance than orthodox supervised learning algor...
Investigating Problems of Semi-supervised Learning for Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the problem of determining the right sense of a polysemous word in a given context. In this paper, we investigate the use of unlabeled data for WSD within the framework of semi-supervised learning, in which the original labeled dataset is iteratively extended by exploiting unlabeled data. This paper addresses two problems occurring in this approach: deter...
Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation
Word sense disambiguation helps identify the proper sense of ambiguous words in text. In large terminologies such as the UMLS Metathesaurus, ambiguities appear, and highly effective disambiguation methods are required. Supervised learning algorithms are one of the approaches used to perform disambiguation. Features extracted from the context of an ambiguous word are used to identif...
Word Sense Disambiguation with the KORA-Ω Algorithm
We present a method for word sense disambiguation (WSD) based on the KORA-Ω supervised learning algorithm. The advantages of the method are its simplicity and the very small feature set it uses, though, as we show, these come at the cost of lower accuracy than complex state-of-the-art methods achieve.
A Comparison between Supervised Learning Algorithms for Word Sense Disambiguation
This paper describes a set of comparative experiments, including cross-corpus evaluation, between five alternative algorithms for supervised Word Sense Disambiguation (WSD), namely Naive Bayes, Exemplar-based learning, SNoW, Decision Lists, and Boosting. Two main conclusions can be drawn: 1) The LazyBoosting algorithm outperforms the other four state-of-the-art algorithms in terms of accuracy and ...